Using Argumentative Zones for Extractive Summarization of Scientific Articles

Authors

  • Danish Contractor
  • Yufan Guo
  • Anna Korhonen
Abstract

Information structure, i.e., the way speakers construct sentences to present new information in the context of old, can capture rich linguistic information about the discourse structure of scientific documents. Information structure has been found useful for important Natural Language Processing (NLP) tasks, such as information retrieval and extraction. Since scientific articles typically follow a certain discourse structure describing the prior work, the problem being solved, the methods used, and so forth, it could also be useful for summarization of these articles. In this work we focus on a scheme of information structure called Argumentative Zoning (AZ), and investigate whether its categories could support extractive text summarization in a scientific domain. We develop a summarization system that uses AZ categories (i) as features and (ii) in the final sentence selection process. We evaluate the system both directly and through task-based evaluation. The results show that AZ can support both full document and customized summarization. We report a statistically significant improvement in summarization performance against a competitive baseline that uses journal section labels instead of AZ information.

Title and abstract in Mandarin

一种根据“论证结构”自动摘录科技文献的方法

信息结构是指作者组织语句陈述信息的方式。信息结构例如科技文献的篇章结构包含丰富的语言信息,有助于解决自然语言处理领域的一些重要问题例如信息检索和信息提取等。科技文献通常使用特定的篇章结构来陈述以往的研究,阐述研究问题以及研究方法等等,这些篇章结构可以被用于文献的自动摘录。本文着眼于一类特定的信息结构—“论证结构”,研究其是否有助于更好地摘录科技文献。在本文开发的摘录系统中,“论证结构”有两种用途:一是作为特征供机器学习,二是用于最终的语句筛选过程。本文对该系统进行了直接和间接的评测,测试结果显示“论证结构”有助于更好地对全文或指定信息进行摘录。基于“论证结构”的摘录系统显著性优于基于章节标题的摘录系统。
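As a rough illustration of how zone labels might enter the final sentence selection step mentioned in the abstract, the following minimal sketch picks the top-scoring sentences while optionally capping how many may come from each zone. It is not the authors' system: the `Sentence` class, the `select_summary` function, the `zone_quota` parameter, the zone labels, and the scores are all hypothetical, and the scores are assumed to come from some upstream ranker (which could itself use the zone label as a feature).

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    zone: str     # hypothetical zone label, e.g. "BKG", "METH", "RES"
    score: float  # hypothetical relevance score from an upstream ranker


def select_summary(sentences, max_sentences, zone_quota=None):
    """Greedily pick the highest-scoring sentences.

    zone_quota (an assumption of this sketch) maps a zone label to the
    maximum number of sentences allowed from that zone; zones without
    an entry are unrestricted. The result keeps document order.
    """
    zone_quota = zone_quota or {}
    counts = {}
    chosen = []
    # Visit sentences from highest to lowest score (ties keep document order).
    ranked = sorted(enumerate(sentences), key=lambda pair: -pair[1].score)
    for index, sentence in ranked:
        if len(chosen) == max_sentences:
            break
        limit = zone_quota.get(sentence.zone)
        if limit is not None and counts.get(sentence.zone, 0) >= limit:
            continue  # this zone has already used up its quota
        counts[sentence.zone] = counts.get(sentence.zone, 0) + 1
        chosen.append(index)
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    doc = [
        Sentence("Prior work studied X.", "BKG", 0.9),
        Sentence("Earlier systems used Y.", "BKG", 0.8),
        Sentence("We train a classifier.", "METH", 0.6),
        Sentence("Accuracy improves.", "RES", 0.7),
    ]
    # A quota of 0 for "BKG" excludes background sentences entirely.
    for s in select_summary(doc, 2, {"BKG": 0}):
        print(s.zone, s.text)
```

Setting a quota of zero for a zone is one simple way to mimic a customized summary that leaves out, say, background material; a real system would likely learn or tune such preferences rather than hard-code them.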

Similar resources

Topical Coherence for Graph-based Extractive Summarization

We present an approach for extractive single-document summarization. Our approach is based on a weighted graphical representation of documents obtained by topic modeling. We optimize importance, coherence and non-redundancy simultaneously using ILP. We compare ROUGE scores of our system with state-of-the-art results on scientific articles from PLOS Medicine and on DUC 2002 data. Human judges ev...

Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted the attention of many researchers. Extractive text summarization is an important summarization method that consists of selecting the most representative sentences from the input document. When we are facing documents with large data volumes, the extr...

Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...

Surveyor: A System for Generating Coherent Survey Articles for Scientific Topics

We investigate the task of generating coherent survey articles for scientific topics. We introduce an extractive summarization algorithm that combines a content model with a discourse model to generate coherent and readable summaries of scientific topics using text from scientific articles relevant to the topic. Human evaluation on 15 topics in computational linguistics shows that our system pr...

Automatic Argumentative-Zoning Using Word2vec

In comparison with document summarization on articles from social media and newswire, argumentative zoning (AZ) is an important task in scientific paper analysis. Traditional approaches to this task rely on feature engineering at different levels. In this paper, three models of generating sentence vectors for the task of sentence classification were explored and compared. The ...

Publication date: 2012